
 logo detection


Brand Visibility in Packaging: A Deep Learning Approach for Logo Detection, Saliency-Map Prediction, and Logo Placement Analysis

Hosseini, Alireza, Hooshanfar, Kiana, Omrani, Pouria, Toosi, Reza, Toosi, Ramin, Ebrahimian, Zahra, Akhaee, Mohammad Ali

arXiv.org Artificial Intelligence

In the highly competitive area of product marketing, the visibility of brand logos on packaging plays a crucial role in shaping consumer perception, directly influencing the success of the product. This paper introduces a comprehensive framework to measure the attention a brand logo receives on a packaging design. The proposed method consists of three steps. The first step leverages YOLOv8 for precise logo detection across two prominent datasets, FoodLogoDet-1500 and LogoDet-3K. The second step involves modeling the user's visual attention with a novel saliency prediction model tailored for the packaging context. The proposed saliency model combines the visual elements with text maps, employing a transformer-based architecture to predict user attention maps. In the third step, by integrating logo detection with saliency map generation, the framework provides a comprehensive brand attention score. The effectiveness of the proposed method is assessed module by module, ensuring a thorough evaluation of each component. Comparing logo detection and saliency map prediction with state-of-the-art models shows the superiority of the proposed methods. To investigate the robustness of the proposed brand attention score, we collected a unique dataset to examine previous psychophysical hypotheses related to brand visibility. The results show that the brand attention score is in line with all previous studies. Also, we introduced seven new hypotheses to check the impact of position, orientation, presence of a person, and other visual elements on brand attention. This research marks a significant stride in the intersection of cognitive psychology, computer vision, and marketing, paving the way for advanced, consumer-centric packaging designs.

  Genre: Research Report > New Finding (0.87)
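
The abstract above does not give the formula for the brand attention score, so the following is only a minimal sketch of one plausible way to combine a saliency map with detected logo boxes: the share of total saliency that falls inside any logo bounding box. The function name `brand_attention_score`, the box format, and the toy saliency values are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max), with the max edges exclusive.
Box = Tuple[int, int, int, int]


def brand_attention_score(saliency: List[List[float]], logo_boxes: List[Box]) -> float:
    """Return the fraction of total saliency lying inside any logo box.

    This definition is an assumption for illustration; the paper's actual
    score may be computed differently.
    """
    total = 0.0
    inside = 0.0
    for y, row in enumerate(saliency):
        for x, value in enumerate(row):
            total += value
            # Count a pixel once even if several boxes overlap it.
            if any(x0 <= x < x1 and y0 <= y < y1 for x0, y0, x1, y1 in logo_boxes):
                inside += value
    return inside / total if total > 0 else 0.0


# Toy 4x4 saliency map with most attention in the centre.
saliency_map = [
    [0.0, 0.1, 0.1, 0.0],
    [0.1, 0.8, 0.6, 0.0],
    [0.1, 0.7, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
# One detected logo covering the central 2x2 region.
boxes = [(1, 1, 3, 3)]
score = brand_attention_score(saliency_map, boxes)
print(f"brand attention score: {score:.3f}")
```

In a real pipeline the boxes would come from the logo detector and the map from the saliency model; here both are hard-coded so the arithmetic can be followed by hand.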

Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification

Sharma, Nakul, Penamakuri, Abhirama S., Mishra, Anand

arXiv.org Artificial Intelligence

In this paper, we study the problem of identifying logos of business brands in natural scenes in an open-set one-shot setting. This problem setup is significantly more challenging than traditionally-studied 'closed-set' and 'large-scale training samples per category' logo recognition settings. We propose a novel multi-view textual-visual encoding framework that encodes text appearing in the logos as well as the graphical design of the logos to learn robust contrastive representations. These representations are jointly learned for multiple views of logos over a batch and thereby they generalize well to unseen logos. We evaluate our proposed framework for cropped logo verification, cropped logo identification, and end-to-end logo identification in natural scene tasks, and compare it against state-of-the-art methods. Further, the literature lacks a 'very-large-scale' collection of reference logo images that can facilitate the study of one-hundred thousand-scale logo identification. To fill this gap in the literature, we introduce Wikidata Reference Logo Dataset (WiRLD), containing logos for 100K business brands harvested from Wikidata. Our proposed framework achieves an area under the ROC curve of 91.3% on the QMUL-OpenLogo dataset for the verification task, and outperforms state-of-the-art methods by 9.1% and 2.6% on the one-shot logo identification task on the Toplogos-10 and the FlickrLogos32 datasets, respectively. Further, we show that our method is more stable compared to other baselines even when the number of candidate logos is on a 100K scale.
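
To make the open-set one-shot setting described above concrete, here is a minimal sketch of how identification could work once embeddings exist: each brand has a single reference embedding, a query is matched to the most similar reference by cosine similarity, and a query that is not similar enough to any reference is rejected as unknown. The embeddings, the brand names, the `identify_logo` helper, and the 0.8 threshold are hypothetical; the paper's encoder and decision rule are not reproduced here.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify_logo(
    query: List[float],
    references: Dict[str, List[float]],
    threshold: float = 0.8,  # assumed rejection threshold for the open-set case
) -> Optional[str]:
    """Return the best-matching brand, or None if no reference is similar enough."""
    best_name: Optional[str] = None
    best_sim = -1.0
    for name, ref in references.items():
        sim = cosine_similarity(query, ref)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name if best_sim >= threshold else None


# One reference embedding per brand (the "one-shot" part); values are made up.
reference_embeddings = {
    "brand_a": [1.0, 0.0, 0.0],
    "brand_b": [0.0, 1.0, 0.0],
}

known = identify_logo([0.9, 0.1, 0.0], reference_embeddings)
unknown = identify_logo([0.5, 0.5, 0.7], reference_embeddings)
print(known, unknown)
```

The linear scan is fine for a toy example; at the 100K-brand scale mentioned in the abstract, an approximate nearest-neighbour index would be the more likely choice.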


Detecting Cloud-Based Phishing Attacks by Combining Deep Learning Models

Jha, Birendra, Atre, Medha, Rao, Ashwini

arXiv.org Artificial Intelligence

Web-based phishing attacks nowadays exploit popular cloud web hosting services and apps such as Google Sites and Typeform for hosting their attacks. Since these attacks originate from reputable domains and IP addresses of the cloud services, traditional phishing detection methods such as IP reputation monitoring and blacklisting are not very effective. Here we investigate the effectiveness of deep learning models in detecting this class of cloud-based phishing attacks. Specifically, we evaluate deep learning models for three phishing detection methods: an LSTM model for URL analysis, a YOLOv2 model for logo analysis, and a triplet network model for visual similarity analysis. We train the models using well-known datasets and test their performance on cloud-based phishing attacks in the wild. Our results qualitatively explain why the models succeed or fail. Furthermore, our results highlight how combining results from the individual models can improve the effectiveness of detecting cloud-based phishing attacks.
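
The abstract does not say how the three models' outputs are combined, so the sketch below shows just one simple possibility: a weighted average of per-model phishing scores compared against a threshold. The score names (`url`, `logo`, `visual`), the equal weights, and the 0.5 threshold are assumptions for illustration only.

```python
from typing import Dict


def combined_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-model phishing scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def is_phishing(scores: Dict[str, float], weights: Dict[str, float],
                threshold: float = 0.5) -> bool:
    """Flag a page when the combined score reaches the (assumed) threshold."""
    return combined_score(scores, weights) >= threshold


# Equal weights for the URL, logo, and visual-similarity models (hypothetical).
model_weights = {"url": 1.0, "logo": 1.0, "visual": 1.0}

# A page on a reputable cloud domain: the URL model sees nothing suspicious,
# but the logo and visual-similarity models do.
cloud_page = {"url": 0.2, "logo": 0.9, "visual": 0.8}
benign_page = {"url": 0.2, "logo": 0.1, "visual": 0.3}

print(is_phishing(cloud_page, model_weights))
print(is_phishing(benign_page, model_weights))
```

The first example illustrates the point made in the abstract: the URL score alone would miss a page hosted on a reputable domain, while the combined score still flags it.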


Logo Detection Using PyTorch – Diving in Deep – Medium

#artificialintelligence

I wrote this blog to wrap up my first ever public talk at PyCon Thailand 2018 and to add some more details. Advertising technology, commonly known as "Ad Tech", has been used by brands, vendors, and agencies to analyze and get insights from potential customers' activities online. In the past year, machine learning and deep learning became major tools for Ad Tech. For example, an image recognition system is used to identify targets such as brands, products, and logos in publicly posted images. The easiest way to identify a brand in an image is by its logo. In deep learning, it's all about creating, training, and deploying a network.


Transfer Learning with augmented Data for Logo Detection

#artificialintelligence

Over the last few months, I have worked on brand logo detection in R with Keras. The goal is to build a (deep) neural net that is able to identify brand logos in images. Just to recall, the dataset is a combination of the Flickr27 dataset, with 270 images of 27 classes, and self-scraped images from Google image search. In case you want to reproduce the analysis, you can download the set here. In the last post, I used the VGG-16 pretrained model and showed that it can be trained to achieve an accuracy of 55% on the training set and 35% on the validation set.


We compared the 3 best image analysis APIs -- here's what we learned

#artificialintelligence

One of the flagship applications of modern machine learning has been working with images: training computers to analyze, classify, and alter different types of pictures. Last year, Google's DeepDream software made waves by creating a series of terrifying, nightmare-inducing images. While image and video classification (that is, telling us what is depicted in the chosen media) is not so cutting-edge anymore, this is actually good news. A number of cheap services have sprung up to make image classification quite accessible. Here at MuseFind we've been looking for ways to help both brands and social media influencers produce better content, and part of that effort is figuring out the correlations between the most successful posts and their content. Below are the top three image/video analysis services we considered: Clarifai, Google Cloud Vision, and Amazon Rekognition.


sbrugman/deep-learning-papers

#artificialintelligence

Papers about deep learning ordered by task, date. Current state-of-the-art papers are labelled.
RenderGAN: Generating Realistic Labeled Data, nov 2016, arxiv
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, feb 2016, arxiv
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and 0.5MB model size, feb 2016, arxiv
Snapshot Ensembles: Train 1, Get M for Free, 2016, paper, github


Google Opens Cloud Vision API Beta to Entire Developer Community

#artificialintelligence

Today, Google announced the beta release of its Google Cloud Vision API. The API was designed to empower applications to both see and understand images submitted to the API. With powerful features such as label/entity detection, optical character recognition, safe search detection, facial detection, landmark detection, and logo detection, the Cloud Vision API gives applications unprecedented ability to comprehend the situation within an image. With the new API, Google enters a rapidly developing market where both startups and major enterprises are producing cutting-edge technology. From Microsoft, with its Project Oxford, to niche startups like Cognitec and Lambda Labs, image analysis is proving to be an attractive space as it appeals across industries from marketing to security.